We investigate and quantify the interplay between topology and the ability to send specific signals 
in complex networks. We find that in a majority of investigated real-world networks the ability 
to communicate is favored by the network topology at small distances, but disfavored at larger 
distances. We further discuss how the ability to locate specific nodes can be improved if information 
associated with the overall traffic in the network is available. 
Not all parts of a complex system interact directly with 
all other parts. Rather, each element interacts directly 
with only a few particular elements. Distant parts of the 
network thereby formed can consequently communicate 
through sequences of local interactions. In this way all 
parts of the network can in principle be reached from 
other parts, but not all such communications are equally 
easy or accurate. The network is thus a description of 
the limited ability to send specific signals in the system. 
We stress the difference between specific signaling in 
networks and its contrary, unspecific broadcasting: where 
specific signaling focuses only on locating one specific 
node without disturbing the remaining network, non-specific 
broadcasting amplifies by transferring signals to all exit 
links of every node along all branching paths. Specific 
signaling is thus constructive communication, whereas 
non-specific broadcasting is rather of relevance for 
disease spreading or computer virus propagation. 

One can imagine various ways of searching for a specific 
node in a network, depending on the information available 
when the search is performed. In the present paper 
we compare ways to guide the search based on locating 
the shortest path between a source and a target in the 
network. Thus we are only characterizing specific signaling, 
where any deviation from shortest paths means 
the loss of the signal. In other words, the cost of deviating 
from the shortest path is assumed to be infinite, 
and we simply quantify the search in terms of the number 
of questions needed to follow the shortest path to the 
target. 

First let us consider the Search Information introduced 
previously. The Search Information of going from source node 
s to target node t, $S(s \to t)$, is the number of bits of 
information one needs to go from s to t using the shortest 
paths: In the beginning, when starting at node s, 
one has to find the right exit link, leading to the second 
node on the shortest path to the target node t. We assume 
that each node is a simplistic autonomous system 
that knows which of its exit links leads to the target. 
The number of questions one has to ask such an 
autonomous system in a source node is $\log_2(k_s)$, where 
[Figure 1. Panel a): all links are equal. Panel b): links are weighted by traffic.] 

FIG. 1: Information measures on network topology: a) Search 
Information $S(s \to t)$ measures your ability to locate node t 
from node s. b) Weighted Search Information $S_w$ measures 
your ability to locate target node t from the source node s, 
when you tend to follow the traffic given by the betweenness 
$b_{ij}$. $S(s \to t)$ is the number of yes/no questions needed to 
locate any of the shortest paths between node s and node 
t. For each such path $P\{p(s,t)\} = \frac{1}{k_s} \prod_j \frac{1}{k_j - 1}$, with j 
counting nodes on the path p(s,t) until the last node before 
t. The factor $k_j - 1$ instead of $k_j$ takes into account the 
information gained by following the path. $S_w(s \to t)$ is the 
similar quantity where we now weight each exit link from a 
node with its betweenness $b_{ik}$, defined as the fraction 
of messages that go through node i which also go through 
neighbor node k. 



$k_s$ is the degree of the source. At the subsequent node, j, 
along the shortest path to the target the number of questions 
is reduced to $\log_2(k_j - 1)$ since the incoming link 
is known. That means that the number of questions one 
has to ask when walking along the path from the source 
to the target is $S(s \to t) = -\log_2\left(\frac{1}{k_s} \prod_j \frac{1}{k_j - 1}\right)$. If there 
is more than one shortest path between s and t, then: 

$$ S(s \to t) = -\log_2 \left( \sum_{\{p(s,t)\}} P\{p(s,t)\} \right) \qquad (1) $$ 

where the sum runs over the set {p(s,t)} of degenerate 
shortest paths between s and t, see Fig. 1. In the previous 
work we investigated S for a number of networks 






FIG. 2: Analysis of real world networks. Here City is the 
information city network of Malmo, with roads mapped to 
nodes and intersections to links. The Internet refers to the 
hardwired network of autonomous systems [14], CEO to 
the network of corporate executives in the US, and Fly to the 
protein-protein network of Drosophila melanogaster [10]. In 
a) we compare $S(l)$ with the similar search performed in a 
randomized version of the network. One observes that search 
on short distances $l \sim 2$–$3$ is relatively optimized in the 
real networks. In b) we compare $S$ with the search obtained 
when one uses the information associated with overall traffic in 
the network. We see that such global traffic information helps 
the search at all long distances. 

and found that one needs more information to orient in 
real than in random networks. By random networks 
we mean networks randomized by reshuffling 
links in such a way as to preserve the degree sequence 
and keep the network connected. To explore the nature 
of these complications in real world networks we, 
in Fig. 2, look at the average Search Information for 
nodes separated by $l$ links and compare it with the corresponding 
quantity in a randomized counterpart. From 
$\Delta S(l) = \langle S(l) \rangle - \langle S_{\mathrm{random}}(l) \rangle$ we see that essentially all 
the contribution to the global excess $\Delta S = S - S_{\mathrm{random}}$ 
comes from large distances $l > 3$ (here $S = \frac{1}{N^2} \sum_{s,t} S(s \to t)$). 
For some of the networks, for example Internet, Yeast 
and Fly, $\Delta S(l)$ is even negative at short distances, 
which implies that these real networks are organized to 
optimize the search at these short distances. Thus local 
specificity is favored whereas communication beyond the 
horizon $l = l_{\mathrm{search}} \sim 3$ is disfavored. 
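
As an illustration of the quantity discussed above, the following is a minimal sketch of how $S(s \to t)$ and its average over pairs at distance $l$ could be computed for a small undirected, connected network. The adjacency-dictionary representation and the function names (`bfs_distances`, `search_information`, `mean_search_information_by_distance`) are assumptions made for this example and are not taken from the paper.

```python
from collections import deque
from math import log2


def bfs_distances(adj, source):
    """Hop distance from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def search_information(adj, s, t):
    """S(s -> t) in bits, summed over all degenerate shortest paths.

    Assumes an undirected, connected graph given as {node: neighbors}.
    """
    if s == t:
        return 0.0
    dist_to_t = bfs_distances(adj, t)  # undirected: d(u, t) == d(t, u)
    # frontier[u]: probability of being at u while still on a shortest path
    frontier = {s: 1.0}
    for _ in range(dist_to_t[s]):
        nxt = {}
        for u, p in frontier.items():
            # k_s exits at the source, k_j - 1 later (the incoming link is known)
            choices = len(adj[u]) if u == s else len(adj[u]) - 1
            for v in adj[u]:
                if dist_to_t[v] == dist_to_t[u] - 1:
                    nxt[v] = nxt.get(v, 0.0) + p / choices
        frontier = nxt
    return -log2(frontier[t])


def mean_search_information_by_distance(adj):
    """Average S(s -> t) over all ordered pairs, grouped by distance l."""
    totals, counts = {}, {}
    for s in adj:
        for t, l in bfs_distances(adj, s).items():
            if t != s:
                totals[l] = totals.get(l, 0.0) + search_information(adj, s, t)
                counts[l] = counts.get(l, 0) + 1
    return {l: totals[l] / counts[l] for l in totals}
```

The sketch recomputes a breadth-first search for every pair, which is adequate for small examples but would need to be reorganized for networks of thousands of nodes.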

To uncover how the topology and the search information, 
quantified by $S$, are coupled we, in Fig. 3, investigate 
a number of model networks. Fig. 3(a) shows 
$S(l) - S_{\mathrm{random}}(l)$ for the various model networks. We see 
that the search is easy at distance $l \sim 3$ in the modular 
network, whereas a randomly rewired network provides 
better search options for $l > 3$. The hierarchical club 
network, on the other hand, clearly does worse than a 
random network on all scales. Here we have obtained the 
surprising and counterintuitive result that hierarchies are 
not always optimal for search. That (club) hierarchies are 
used in many human organizations may thus be seen as a 
way to regulate and thus limit the information exchange, 
rather than to optimize overall specific communication. 
The search information S defined above is based on a 
FIG. 3: Analysis of model networks in terms of the quantities 
in Fig. 2. The simple modular network is constructed of C 
modules with C nodes in each, with 0.2C connections between 
nodes in modules and 0.2C connections between the modules 
($C = \sqrt{N}$). The club tree is a hierarchical network with club 
structure at each level, a construction intended to mimic our 
view of social organization. We used a version with $\langle k \rangle = 6$ 
neighbors per node and with $\sim \log(N)/\log(\langle k \rangle/2)$ hierarchical 
levels. The scale-free network is an example of networks with 
broad degree distributions, here scaling as a power law. In all cases 
we simulate N = 5000 node networks. 

minimal approach where one at each node knows nothing 
about the relative importance of the neighbors. However, 
in real social networks one often knows who is best connected 
to the rest of the system. For example, in a military 
hierarchy, every soldier knows who their immediate 
superior is. This knowledge can be obtained self-consistently 
at any node in any network by monitoring the 
traffic of orders past this node. In order to explore how 
the search can be simplified by additional knowledge we 
introduce a slightly different quantification of search information. 
That is, we explore the information needed to 
search if one knows the overall traffic flow. When questioning 
the minimal autonomous system at a node, we 
weight the questions according to the betweenness of the 
links to the node. Thereby we define the weighted 
search information 

$$ S_w(s \to t) = -\log_2 \left( \sum_{\{p(s,t)\}} b_{s,1} \prod_{j \in p(s,t)} b'_{j,j+1} \right) \qquad (2) $$ 

where j labels the node on the path p(s,t), starting at 
j = 1 for the neighbor node to s. $b_{j,j+1}$ is 
the betweenness of the link from the node with label j to the node 
with label j + 1, divided by the sum of the betweennesses 
of all k links from j. $b'_{j,j+1}$ is similarly defined except 
that the normalization excludes the link to the preceding 
node of j on the shortest path between s and t. 
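
A minimal sketch of the weighted search information for a small undirected, connected network follows. It assumes that link betweenness is counted over all shortest paths between all ordered pairs of nodes (computed here with a Brandes-style accumulation), that the first step out of s is normalized over all links of s, and that later steps exclude the incoming link from the normalization. The function names and the adjacency-dictionary format are assumptions made for the example.

```python
from collections import deque
from math import log2


def edge_betweenness(adj):
    """Shortest-path traffic on each link, summed over all ordered pairs."""
    B = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        dist, sigma, preds, order = {s: 0}, {s: 1}, {s: []}, []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v], sigma[v], preds[v] = dist[u] + 1, 0, []
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {u: 0.0 for u in order}
        for w in reversed(order):  # accumulate traffic from the far end inwards
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                B[frozenset((v, w))] += c
                delta[v] += c
    return B


def weighted_search_information(adj, B, s, t):
    """S_w(s -> t) in bits, with exit links weighted by their betweenness."""
    if s == t:
        return 0.0
    dist_to_t = {t: 0}
    queue = deque([t])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist_to_t:
                dist_to_t[v] = dist_to_t[u] + 1
                queue.append(v)
    # state = (previous node, current node), because b' depends on the incoming link
    frontier = {(None, s): 1.0}
    for _ in range(dist_to_t[s]):
        nxt = {}
        for (prev, u), p in frontier.items():
            norm = sum(B[frozenset((u, k))] for k in adj[u] if k != prev)
            for v in adj[u]:
                if dist_to_t[v] == dist_to_t[u] - 1:
                    w = B[frozenset((u, v))] / norm
                    nxt[(u, v)] = nxt.get((u, v), 0.0) + p * w
        frontier = nxt
    return -log2(sum(frontier.values()))
```

For a chain 0-1-2-3 with an extra leaf 4 attached to node 1, the link from 1 to 2 carries more traffic than the link from 1 to 4, so with these definitions the weighted search from 0 to 3 costs less than one bit, while the search from 0 to the leaf 4 costs more than the one bit needed without weights.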

To understand the difference between $S$ and $S_w$ we 
consider a city (defined through the city network where 
each node is a road, and each link an intersection). 
When orienting yourself with the strategy behind $S$, small 
and large roads are weighted equally. However, $S_w$ captures 
that large roads more often take you closer to the 
target than small roads. For all investigated networks 
one gains on average by using the weighted search strategy. 
However, the contribution is not homogeneously distributed 
over distance. As one can see in Fig. 2(b), the 
weighted strategy is more efficient at longer distances, 
$l > 3$. However, $S_w > S$ for $l < 3$, and thus it turns out to 
be inefficient to follow the flow when the target is nearby. 
This reflects the fact that if you follow the flow you will 
nearly always overlook small roads in your neighborhood. 
In terms of navigating in a city, the $S_w(l) - S(l)$ difference 
shows that it pays off to follow the large roads until 
you are within a few turns from your end target. Then 
it naturally pays off to change strategy and disregard 
the main stream. The distance where $S_w(l) - S(l)$ becomes 
negative therefore defines a characteristic search 
horizon, $l_{\mathrm{local-global}}$, at which one should switch from local 
to global search strategy. 

We next study the relative advantage of local versus 
global search strategies for some model networks in Fig. 
3(b). Like the real world networks, the model networks 
also have $S_w > S$ at small distances, and $S_w < S$ at 
large distances. In particular, the club tree (hierarchy) 
performs extremely badly at short distances because there is 
a strong bias to go along the main flow, and one thus 
needs a lot of effort to locate peripheral neighbors. For a 
random scale-free network, on the other hand, the overall 
traffic very quickly guides you to the center, and therefore 
$S_w$ is a good search strategy at nearly all distances. The 
scale-free network represents topologies with very broad 
degree distributions, and in these one nearly always benefits 
by following the flow. 

In between these two networks is the modular network, 
where the global flow confuses local search ($S(l < 3) < 
S_w(l < 3)$), but helps traffic to other modules and thus 
to the more distant targets. Returning to the real world 
networks in Fig. 2(b), their $l_{\mathrm{local-global}} \sim 2$ horizon for 
traffic guided search may be seen as a combination of a 
short $l_{\mathrm{local-global}} \sim 1$ horizon associated with their broad 
degree distribution (scale-free in Fig. 3(b)), and a larger 
$l_{\mathrm{local-global}}$ horizon associated with modular or hierarchical 
features. 

One may ask whether the two search strategies can 
be combined, such that one uses local information for 
local search, and global traffic information for long distance 
search. In terms of traffic in a city the picture 
is that there are multiple types of traffic, from pedestrians 
to short distance targets, through bicycles to intermediate 
distance targets, to cars for the distant targets. In accordance 
with this picture we introduce the limited betweenness 
measures $b_{ij}(r)$ for the links $j$ around a node $i$, 
defined by traffic between all pairs of nodes that 
moves at most a distance $r$ between the source and 
the target. Given this set of $r$-dependent traffic weights, 
we, in analogy with Eq. (2), define a set of search measures 
$S_{w(r)}(l)$. For $r = 1$, $S_{w(1)}(s \to t) = S(s \to t)$, 
FIG. 4: Investigations of search strategies in a model network 
a) and the information city network of Malmo b). The figures 
illustrate 3 simple search strategies, and one optimized 
(shaded). $S_{w(r)}(l)$ denotes the search information based only 
on traffic between nodes separated by not more than r steps 
in the network. We see that nodes at short distance are best 
found by using the local search strategy ($S_{w(r=1)} = S$), whereas 
search for distant nodes is best performed by using information 
from global traffic. However, nodes at intermediate 
distances are best found by using traffic between nodes at 
intermediate distances. To optimize search we also show a 
different search strategy $S_{\mathrm{scale-adjusted}}$, where one at each step 
j along the path to the target adjusts the traffic horizon to 
the remaining distance to the target. We furthermore restrict 
the traffic bias to the subset of traffic that is targeted at the 
node j one is currently at (see Fig. 5). 



whereas $S_{w(r=\infty)}(s \to t) = S_w(s \to t)$, and thus $S_{w(r)}$ 
naturally interpolates between the non-weighted and the 
traffic-weighted search approaches. In Fig. 4(a) we examine 
$S_{w(r)}(l)$ for the club-hierarchy network from Fig. 3. 
In accordance with Fig. 3(a) we again see that the longer 
distances indeed are best searched by using long distance 
traffic. In addition we see that intermediate distances 
$l = 3$–$7$ are best searched by using a search weighted by 
traffic traveling intermediate distances $r \sim 5$, as quantified 
by $S_{w(5)}(l)$. 
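
The limited betweenness can be sketched by restricting the traffic count to source-target pairs that are at most r steps apart. The example below does this by truncating the breadth-first search of a Brandes-style accumulation at depth r; the function name `limited_edge_betweenness` and the adjacency-dictionary format are assumptions made for illustration. Passing `float("inf")` as r gives the ordinary link betweenness.

```python
from collections import deque


def limited_edge_betweenness(adj, r):
    """Link traffic from shortest paths between pairs at distance <= r.

    Sums over ordered (source, target) pairs of an undirected graph.
    """
    B = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        dist, sigma, preds, order = {s: 0}, {s: 1}, {s: []}, []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            if dist[u] == r:
                continue  # targets beyond the horizon carry no traffic
            for v in adj[u]:
                if v not in dist:
                    dist[v], sigma[v], preds[v] = dist[u] + 1, 0, []
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {u: 0.0 for u in order}
        for w in reversed(order):
            for v in preds[w]:
                # each reached node w is one target within the horizon
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                B[frozenset((v, w))] += c
                delta[v] += c
    return B
```

With r = 1 every link only carries the traffic between its own two endpoints, so all links get the same weight and the weighted search reduces to the unweighted one, in line with the statement that $S_{w(1)} = S$.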

Fig. 4(b) shows the optimal search strategy in a real 
network, here the information city network of Malmo. 
Again the search efficiency is improved by adjusting the 
traffic horizon to the search distance. In fact the search 
can be further optimized by, at each step, adjusting the 
traffic horizon r to the remaining distance to the target. 
In the language of city networks, when searching for 
a distant road, one first uses information from car traffic, 
but as the distance to the target becomes smaller than, say, 
5 intersections, one instead uses bicycle and then subsequently 
pedestrian traffic. This overall feature of optimizing 
search works best when one weights the exit link 
from each node j by the fraction of overall traffic targeted 
explicitly at j. Thus, the optimal search indicated by the 
shaded area in Fig. 4 corresponds to a search strategy 
where one, at each step j from s to t, biases the search 
according to the subpart of the traffic that is targeted at 
j, and which has a source at a distance no further 
away than the target t (see Fig. 5). The difference 
from the normal betweenness is that the target betweenness 
effectively partitions the network around each node 
j, such that each exit link is weighted by the fraction of 
the network that it leads to. This therefore provides a 
more efficient guess of the direction to the rest of the 
network from j than the normal betweenness. 
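
One possible reading of this target betweenness is sketched below: for a node j and a horizon, each exit link is weighted by the share of nodes within the horizon whose shortest paths to j arrive through that link, with degenerate shortest paths sharing a node's unit weight equally. The function name `target_traffic_weights`, the `horizon` parameter and the adjacency-dictionary format are assumptions for this example, and the authors' exact definition may differ in detail.

```python
from collections import deque


def target_traffic_weights(adj, j, horizon):
    """Weight of each exit link (j, k) from traffic targeted at j.

    Only sources within `horizon` steps of j contribute. Assumes an
    undirected graph without self-loops; returns normalized weights.
    """
    dist = {j: 0}
    via = {}  # via[u][k]: number of shortest paths j -> u whose first hop is k
    queue = deque()
    for k in adj[j]:
        dist[k] = 1
        via[k] = {k: 1}
        queue.append(k)
    weights = {k: 0.0 for k in adj[j]}
    while queue:
        u = queue.popleft()
        total = sum(via[u].values())
        for k, count in via[u].items():
            weights[k] += count / total  # u contributes one unit of traffic in total
        if dist[u] == horizon:
            continue  # do not look beyond the current horizon
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                via[v] = {}
                queue.append(v)
            if dist[v] == dist[u] + 1:
                for k, count in via[u].items():
                    via[v][k] = via[v].get(k, 0) + count
    norm = sum(weights.values())
    return {k: w / norm for k, w in weights.items()}
```

In a scale-adjusted search one would call such a function at every step with the horizon set to the remaining distance to the target, so that the bias gradually shifts from long-range to local traffic.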




FIG. 5: Illustration of the optimal search strategy on an information 
city network of Gamla stan, Stockholm [3]: At each step 
the contribution to the traffic bias is limited by sequentially 
decreasing horizons (circles). The radius of each horizon reflects 
the node distance to the target. In the case of a large 
city, each horizon corresponds to the type of transportation 
one should consider: Within the big circle one looks for cars 
and within the smallest circle for pedestrians. The path is 
indicated in black. 

Obviously the optimal search strategy can only be used 
if one has access to this distance-dependent traffic information. 
However, as in a city, such information can, for 
example, be quite well estimated in social networks. Consider 
Milgram's famous result of a mail locating a target 
person in a chain of typically six acquaintances between 
two persons in the USA. The nontrivial result of Milgram's 
experiment is not that the distance between two 
persons is just six, since the dimension of social networks 
is high, but the fact that short paths were found in 
the experiment. In terms of our optimal search strategy, 
Milgram's experiment is interpreted in the following way: 
Every participant who receives a mail aimed at a distant 
target person passes it on to a friend, with a 
chance weighted by how often this friend travels over distances 
up to the scale of the target distance. With such 
a search strategy, which at each point along the path is 
adjusted to the horizon to the target, the mail will find 
a short path to the target person with high probability 
(low information cost). We speculate that humans inherently 
tend to use such a scale-free search strategy, and by 
this facilitate robust communication on all scales ranging 
from a single remote village to the whole planet. The 
information gain by doing so in the city of Malmo is illustrated 
by the difference between the black curve and the 
shaded area in Fig. 4(b). 

In the present work we have quantified the information 
cost associated with transmission of specific signals across 
a complex network. By comparing real and random networks, 
we have shown that many real-world networks 
tend to have optimized searchability at rather short distances, 
$l \sim 3$. The cost of this optimization is that beyond 
this horizon one must use more intelligent methods to 
facilitate searchability. In the spirit of communication 
we have investigated methods based on global traffic observed 
at the local level and interpreted them in real-world 
examples. 

In many networks, in particular social or traffic networks, 
the search strategy can be adjusted according to 
average traffic flow. The distance at which global traffic 
becomes superior to unbiased search defines a horizon 
associated with the largest scale of modules in a network. 
In general, every network we have investigated is best 
searched by using the "scale invariant" strategy, where 
directions are selected according to the average traffic to 
nodes at distances similar to that of the searched target 
node. 
The hierarchical search algorithms are, however, effective, 
but only in the specific case of a perfect tree-like hierarchy 
searched from the top. In that case the S measured from 
the top is ~ log2 {N + 1) -1-2/N + 'ilog2{N + 1)/{N{N + 
1)) « log2{N/2) which is a smaller search information 
than for any other organization of a network consisting 
of N nodes. 